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Part  I 
1.   Introduction 

A  central  methodological  issue  In  the  discovery  of  oil  and  gas  deposed 
in  sediments  within  a  geological  basin  is:   how  should  a  sample  of  discovered 
deposits  be  used  to  make  inferences  about  the  distribution  of  sizes  (in 
barrels  of  oil  or  in  cubic  feet  of  gas)  of  undiscovered  deposits  and  about 
the  number  remaining  undiscovered? 

Viewed  as  a  sampling  process,  the  process  of  observing  pool  (deposit) 
sizes  in  order  of  discovery  is  more  akin  to  sampling  without  replacement 
and  proportional  to  size  than  to  sampling  values  of  independent,  identi- 
cally distributed  random  variables.   This  feature  has  been  generally  ignored 
in  published  statistical  studies  of  the  size  distribution  of  mineral 
deposits  (cf.  Krige  (1951),  Allais  (1957),  Kaufman  (1962),  McCrossan  (1969) 
for  example) . 

More  specifically,  it  is  reasonable  to  postulate  that  discovery  sizes 
in  order  of  observation  are  generated  by  sampling  without  replacement  from 
a  finite  population  of  pools  whose  sizes  (area,  volume)  are  generated  by 
yet  another  random  process;  i.e.,  the  finite  population  of  pools  is  a 
random  sample  from  a  hypothetical  infinite  population  (a  super-population) 
whose  size  distribution  is  of  known  functional  form.   This  characterization 
of  sampling  from  a  finite  population  is  well  known  in  the  statistical 
literature  and  has  been  used  to  develop  classical,  fiducial,  and  Bayeslan 
procedures  for  estimation  of  finite  population  parameters  (cf.  Cochran  (1939), 
Fisher  (1956),  Erlcson  (1969),  and  Palit  and  Guttman  (1973)  for  example). 
Suppose,  however,  that  the  sample  drawn  from  the  finite  population  Is  random, 
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without  replacement  and  proportional  to  size  in  the  sense  that  the  proba- 
bility of  observing  the  ith  finite  population  element  at  the  j th  sample 

observation  is  equal  to  the  ratio  of  the  size  A.  of  that  element  to  the 

1 

sum  of  sizes  of  as  yet  unobserved  finite  population  elements.   Then,  although 
the  general  framework  is  relevant,  unfortunately  none  of  the  specific  tech- 
niques developed  in  the  literature  just  cited  can  be  directly  applied.   An 
additional  complication  is  that  the  number  of  elements  in  the  finite  popu- 
lation is  generally  not  known  with  certainty  in  this  particular  problem. 

Our  purpose  here  is  to  analyze  the  implications  of  the  juxtaposition 
of  the  following  two  assumptions  about  the  manner  in  which  discoveries 
are  observed: 

T.   Nature  generates  values  A  , . . . ,A^  of  a  sequence  of  N  mutually 

independent  random  variables  (rvs)  identically  distributed 

with  common  density  f  concentrated  on  [O.^")  . 

Elements  of  the  finite  set  Q  =  {A  , . . . ,A^}  are  sampled  without  replacement 

and  proportional  to  size;  i.e., 

II.   The  probability  of  observing  A   ,...,A   ,  n  <  N,  in  that  order 

1       n 

is,  upon  relabelling  elements  of  Q„  so  that  (i, ,...,i  )  =  (l,2,...,n), 

N  in 


LL 

P{(l,2,...,n)|Q  }  =  n  A./(A.+. ..+A^) 

i=l  J    J 


J  = 

While  superficially  simple  in  form,  assumption  II  leads  to  a  rather  compli- 
cated density  for  sample  observations,  one  whose  properties  differ  greatly 
from  those  implied  by  assuming  that  each  element  of  Q   remaining  unobserved 
has  an  equally  likely  chance  of  being  observed  next.   For  example,  when 
the  superpopulation  is  exponential  the  mean  of  the  nth  observation  decreases 
linearly  with  n. 

Let  Y.  denote  the  observed  value  of  the  ith  observation,  define 
J 


Y  =  (Y  ,...,Y  )  as  the  vector  of  observations  In  a  sample  of  size  n  ^  N, 

and  assume  that  f  is  a  member  of  a  class  of  densities  (all  of  whose  members 

are  concentrated  on  [0,°°))  indexed  by  a  parameter  6^  e  0  so  that  A.  has 

density  f(*|9).   Then  given  6,  N,  and  infinitesimal  intervals  dY, ,...,dY  , 

~  In 

and  defining  b.  =  Y.+...+Y  ,  the  probability  of  observing  Y,  e  dY, , . . . ,Y   £  dY 
2  3  n      "^         ^  *ll''nn 

in  that  order  (or  equivalently,  of  observing  Y  e  dY)  is 

P{Y    e   dY|£,N}    = 
n  ooooj^  jj 

r  N-nli)    "    Y  f(Y  |e)dY    /.../    n     [b  +A      +...+A^]-i      n      f(A  |e)dA^        (1.1a) 

^  j=l     ^        J  J    o        0  j  =  l        -"  k=n+l 

~  ~  -  *N-n  -  ~  r 

Defining  S^_^  =  A^_^^+.  .  .+A^  and   f  as   the   density  of   S„_    ,    P{YedY|e^,N} 

may   also   be  written  as 

[S-nIn      "     Y   f(Y    |9)dY      /   f*^""(s|e)    H      [b  +S]"^dS  (1.1b) 

Given  0^,N,  and  that  Y  =  Y  has  been  observed,  the  density  of  the  sum 

S„   of  unobserved  elements  of  Q„  and  the  density  of  the  sum  R^  =  S    + 
N-n  ^N  -^  N    N-n 

Y  +. . .+Y  of  elements  of  Q  follow  directly  from  (1.1).   The  former  density 


is 


*N-         ^  -1 

K(Y)f  ^  "(s|e)  n  [b.+s]  (1.2) 

j=i  ^ 


for  S  e  [0,°°)  and  zero  elsewhere.   Here 

00  Jl 

[K(Y)]'^  =  /  f*^""(s|e)  n  [b.+s]"^ds.  (1.3) 

0        j=i   ^ 

The  density  of  R  given  £,N,  and  that  Y  =  Y  is  observed  is 

K(Y)f  '^  "(R-b^|e)   n   [R-b^+b.]  ^  (1.4) 

j  =  l        -^ 


for  R-b^  >  0  and  zero  otherwise.   In  the  application  of  this  model  to  the 
problem  of  estimating  undiscovered  oil  and  gas,  R^  and  S    are  of  central 
interest;  R^^  is  interpretable  as  the  total  volume  of  hydrocarbons  in  a 
petroleum  zone  and  S     is  interpretable  as  the  volume  in  it  remaining 
undiscovered  after  the  first  n  discoveries.   From  the  geologist's  view- 
point, R^  is  viewed  as  an  unknown  physical  quantity  whose  value  may  be 
crudely  estimated  by  a  variety  of  techniques,  the  most  frequently  used 
being  an  extrapolation  of  the  volume  of  hydrocarbons  already  discovered 
per  cubic  mile  of  sediment  explored  to  the  total  volume  of  sediment  in 
the  geological  horizon  or  zone  constituting  the  play.   We  shall  present 
statistical  methods  in  keeping  with  the  nature  of  our  model. 

To  evaluate  the  economic  desirability  of  investing  in  exploratory 

effort  in  a  petroleum  zone  after  n  discoveries  of  sizes  Y,  , . . .  ,Y   have 

1      n 

been  made,  the  density  of  sizes  Y   , . . . ,Y  of  future  discoveries  given 
i,N,  and  Y  =  Y  is  needed. 

We  begin  by  describing  features  of  our  model  under  the  assumption 
that  the  superpopulation  density  f  (*  |6.)  is  exponential.   The  advantages 
are  obvious:   properties  of  the  quantities  S    ,  R^,  and  Y   ^  given 
Y  , . . . ,Y   are  exactly  computable  and  this  case  serves  well  as  a  descrip- 
tive foil,  laying  bare  in  a  simple  setting  the  model's  key  features. 

Section  3  is  a  study  of  the  model  when  the  superpopulation  density  is 

any  density  concentrated  on  [O,").   An  integral  representation  of  the  density 

of  Y^  useful  for  doing  asymptotic  analysis  of  it  is  given  first.   Then  a 

different  integral  representation  designed  for  doing  numerical  computation 

of  the  sampling  density  of  Y_  is  presented.   Representations  of  similar  types 

of  the  marginal  moments  (when  they  exist)  of  Y   and  the  conditional  moments 

n 

of  Y    given  Y^    =  Y^,...,Y     =  Y   follow.   The  correlation  structure  of  the 
n+q        i    i      n    n 


Y.s  is  computed.   Asymptotic  approximations  to  the  expectation  of  Y   are  dis- 
cussed and  a  numerical  example  given  for  the  case  when  f (■  |6^)  is  lopnormal; 
i.e.  E(Y  )  for   n  =  1,...,-tN  is  computed  to  six  digits  accuracy  and  the  per- 
centage errors  of  the  approximations  are  computed  in  turn.   Section  3  closes 
with  a  statement  of  two  asymptotic  approximations  to  the  sampling  density 
valid  for  fixed  Y  and  n  and  large  N-n. 

The  case  when  f (• |^)  is  lognormal  is  of  great  practical  importance  for 
resource  estimation  since  it  is  widely  assumed  that  the  size  distribution  of 
many  types  of  mineral  deposits  as  deposed  by  nature  is  lognormal,   This 
assumption  can  be  empirically  supported  by  examining  pool  size  distribu- 
tions for  very  intensively  explored  petroleum  zones  (cf.  Arps  and  Roberts 
(1958),  Kaufman  (1962),  and  McCrossan  (1969)).   For  example,  Figure  1  from 
McCrossan  (1969)  is  a  graph  of  sizes  of  Leduc  reef  pools  plotted  on  lognormal 
probability  paper.   Unfortunately,  computations  in  this  case  are  quite  com- 
plicated.  The  density  of  Y  and  the  moments  of  Y  regarded  as  a  function  of 
the  sample  size  n  change  form  as  n  increases; 

Figure   1 
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that  is,  there  Is  a  turning  point  at  n  ~/N,    and  the  form  of  these  functions 
changes  character  as  n  passes  through  it.   Figure  2  is  a  graph  of  marginal 
expectations  of  discovery  sizes  generated  by  simulating  values  of  the  Y.s, 
assuming  that  E(log  A  )  =6.0  and  Var(log  A  )  =  3.0  and  that  N  =  100,  150, 


300,  600,  1200. 


Figure  2 
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(Graphs  of  E(Y  )  for  N  =  150,  300,  600  based  on  numerical  computation  using 
the  exact  integral  representation  (3.4  )  are  presented  in  Figure  4  along  with 
a  discussion  of  moments  for  the  general  case.) 

In  Part  II,  sections  4  through  8,  the  case  of  lognormal  f  is  studied  in 
detail.   An  asymptotic  expansion  of  the  sampling  density  uniformly  valid  in 
a  neighborhood  of  the  turning  point  is  given  in  section  4  along  with  an  out- 
line of  the  procedure  used  to  compute  it.   The  procedure  is  presented  in 
section  5.   It  is  shown  in  section  6  that  the  uniform  expansion  connects  to 


the  expansions  [(3.11)  and  (3.13)]  given   in  section  3  and  in  so  doing  an 

improvement  [(6.5)]  of  (3.11)  appears.   Section  7  offers  an  approximation 

to  the  conditional  expectation  of  Y  . ,  given  Y  =  Y  , .  .  •  ,Y,  =  Y,  expressed 

n+1  ''      n    n      1    1        ^ 

in  terms  of  a  ratio  of  densities. 

Ta  section  8,  using  as  sample  data  the  sizes  of  60  fields  discovered  in  the 

North  Sea,  we  display   iso-contour  plots  of  an  (approximate)  likelihood 

function  for  lognormal  parameters  \i   and  o^   given  fixed  finite  population  size 

N  using  the  uniform  expansion  (4.7).   A  graph  of  the  likelihood  function  for 

N  given  a  fixed  pair  of  values  of  y  and  o^   is  also  shown.   Finally,  the  graph 

of  an  approximation  to  expectations  of  yet  to  be  observed  Y.  ....,Y„.  civen 
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observed  values  of  Y  , . . . ,Y   —  a  forecast  of  future  sizes  of  discovery  in 
order  of  occurrence  —  is  displayed. 

A  subsequent  paper  will  discuss  numerical  computation  of  the  sampling 
density,  of  moments  of  observations,  and  their  coirrelation  structure,  and 
estimation  methods  for  this  sampling  model  with  eirphasis  on  lognormal  super- 
populations. 


2.   Exponential  Super-Population 

When  f  Is  an  exponential  (or  ganma)  density,  the  sampling  density 
of  observations  Y  can  be  expressed  in  terms  of  incomplete  gamma  functions 
and  moments  of  the  Y.'s  can  be  calculated  explicitly.   Letting  f(x|6^)  E 
Xexp{-Xx},  A>0,  the  density  of  Y  given  9^  and  N  but  unconditional  as 

regards  S„   may  be  found  directly  from  (1.1):   recognizing  that  the 

N-n 

*N-n   I 
marginal  density  f    (S  1 6^)  of  the  sum  S    of  unobserved  finite  population 

elements  is  a  gamma  density  and  doing  a  partial  fraction  expansion  of 

""       -1 

n  [b.+S]   ,  we  find  after  term  by  term  integration  over  S  >  0  that  the 

j  =  l  ^ 

density  of  Y  is 

,N-n„,    .    u    -X.Y.   n    -Ab    ,     . 

TopSiT^  '."/j^    '  '\l    V    V""~''  r(-<N-„-i,,.v  (2.1) 

j=l         k=l 


where 

n 


,~1  ^^A      r/_/'M i\  \u    \    -    r  _-t^-(N-n)^ 


Cj^  =  n   (b^-bj^)    and  r(-(N-n-l)  ,Abj^)  =  /^   e  t' dt, 

i=l 

The  kth  moment  of  the  nth  observation  is 


„.~k  _  (N-n-fk)  .  .  ■  (N-n+1)  ,  r(k+2)  ,„  „, 

''^n''      (N+k)  .  .  .  (N+1)       ^k  ^^•^' 


and  in  particular 


^(V=!fi-iJri'  (2.3) 


^^^n^   =f  [1-^lfl-^]'  (2.4) 


^(\V  =1-  tl-^]  fl-^JfeJ   fork>.,  (2.5) 

^-(\'V  =  a4nIi)'!n12)   f-  '^  '   '"'  (2.6) 


and 


^-^V  =i^^(V-^wryToi^  (2.7) 

As  N  ->  «>,  E(Y  )  ->  2/A,  Var  (Y  )  ^  2/A^,  and  Cov(Y,  ,Y  )  -v  0  so  that  for  n 
n  n  km 

fixed  and  N  large,  the  sampling  process  behaves  approximately  as  if  one 
were  sampling  independent  variables  with  common  density  (Ax)exp{-Ax}/M  , 

M^  =  1/A. 

^   1  ?  ~ 
The  mean  and  variance  of  the  sample  mean  Y  =  —  )   Y.  follow  directly 

"j=l   ^ 

from  (2.3)-(2.7),  e.g.,  E(Y)  =|  [1  -  ^^]   =^  [1  +  |g]  • 

Writing  the  density  of  Y    given  Y  =  Y  as  the  ratio  of  the  density 
of  ("^-+1  »X)  to  that  of  Y,  and  mimicing  the  computation  of  the  marginal 
moments,  we  find  that  the  expectation  of  the  kth  moment  of  Y    given 


Y  =  Y  is 


r(k+2)        k   »N+k,n'^^\  ,      Q. 

Ak    ^^   N-n+k  H^  ^(Y)   J  ^^-"^ 


where 


°°  n         1   -x  N-n-1 

-'ex      , 

dx. 


H^   (Y)  =  /  (  n  [x+Ab  ]-  }  ^-^ 
o  j=l     -^ 


For  fixed  b.  and  large  N-n  the  leading  term  in  an  asymptotic  expansion  of 
H^   (Y)  is   n  [N-n+Ab.]   ,  so  if  we  define  b  , ,  =0,  (2.  8)  may  be  approxi- 

j=i 

mated  by 

Li^±^  "n  ri ^ 1  (2  9) 

,k    .\     ^^        N-n+k+Ab.^'  ^^-     ' 


10 


"       _1 

That  the  leading  term  is   II  [N-n+Ab . ]    is  established  in  a  more  general 

j=l       ^ 

setting  in  section  3.   However,  by  rewriting 

n  _i   „-(N-n)yr/„_^,,TN-n-l 

j  = 


«•   n  _i    -(N-n)y,,    .  ,N-n-l 

\  Jl)    =   I      (  n  [(N-n)y+Xb  ]   }  ^ T(N-n)     dy(N-n) 

'        o   1=1  •J 


1 


H^   (Y)  is  expressed  as  the  expectation  of  a  function  11  [(N-n)y  +  Ab . 

J=l 

with  respect  to  a  gamma  density  having  mean  one  and  variance  1/N-n,  so 
that  for  large  N-n  and  fixed  b.'s  it  is  clear  that  the  major  contribution 
to  the  value  of  the  integral  must  come  from  y  in  g  neighborhood  of  one. 

If,  in  place  of  an  exponential  density,  f('|6.)  is  a  gamma  density 
A  exp{-Ax}(Ax)'^"-'^/r(r), 

.~k  =  r(k+r+l)     r(N+l)     r(N-n+l+[k/r])  ^    1_  /.  i  fl) 

^  n^    r(r)      r(N+l+[k/r])     r(N-n+l)       ^k    ^    '       ' 


In  Figure  3  graphs  of  E(Y  )  for  N  =  600,1200  and  f(*|^)  gamma'  are  compared 

with  graphs  of  E(Y  )  for  f(*|9)  lognormal.   Parameters  A  and  r  are  chosen  so 

n 

that  the  mean  r/A  and  variance  r/A   of  the  gamma  density  match  the 
mean  and  variance  of  a  lognormal  f(*|6^)  with  y  =  6.0  and  o  =  3.0.   The 
difference  in  behavior  is  pronounced. 
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3 .   Properties  of  the  Sampling  Density 
The  density  of  Y  given  6^  and  N  may  be  written  as 

^N,n(I\Vj'^'j'-^ 
J=l 

where,  letting  b.  =  Y.+.,.+Y  , 
J    J      n 

OO       OO    jj  1     N 

^N,n(^>  =  Y(^   /  •••/   ."  fVVl+---^\l"  ,  n   f(A^|e)dA^.       (3.0) 

0    o  j=l  k=n+l 

Except  in  the  simplest  of  cases,  I    (Y)  possesses  no  simple  representation. 
We  present  two  integral  representations  of  It,  the  first  useful  for  computing 
both  uniform  and  non-uniform  asymptotic  expansions  of  the  density  of  Y,  the 
second  useful  for  numerical  computation  of  it,  and  for  computation  of  the  con- 
ditional moments  of  Y    given  (Y  , . . . ,Y  )  =  (Y  , . . .  ,Y  )  as  well.   They  are 
expressed  in  terms  of  either  the  characteristic  function 

oo 

G(y)  =  /   e'^^^'fCxlDdx 


of  f  or  in  terms  of  its  Laplace  transform  G(-iy)  =  L(y) 

2-rT 


n.     .         I- ,        s         If        -iu(y-x). 
Assertion   1:      Defining  o(x-y)   =  —  J      e  du. 


—00 


Vn®  '  tSSIHO  ■■('''- Jl"i''"l- 


oo  n 

X    /     y""^exp{-iy   I     a.b . } [G(y) ]^""dy . 
o  j=l     ^    ^ 


.da 
n 


(3.1) 


Note  that  I„   is  real. 

N,n 


13 


n  -1 

If  we  decompose      II      [b.+A    ,i  +  '--+A^,]        by  partial   fractions. 


•compose      II      [b.+A    ,  ,  +  •  •  •'^\,. 
j  =  l 


n  [b+A^^^+...+A^]      =    [    ^^ 
J=l    ■  J=l    J 

Using  the  identity 

-1        00 


f'j 


.  -r]r   with  p.  =  n  [b.-b.] 
n+i+---+\      J   i=l  ^  J 

±^3 


f^-'Vl^'-'+^l    =/  exp{-A(b.+A^^^+...+A^) 


}dX 


for  each  term  of  the  partial  fraction  decomposition  and  integrating  over 
the  As,  we  have  that 

oo 

\   nd)  =  /  Z(A)[L(A)]^""dA 

'         0 

where  ,, 

n     -Ab. 

Z(X)  =  I      p  e    \ 
j  =  l  ^ 

At  A  =  0,  the  first  n-2  derivatives  of  Z(A)  are  zero  and  the  (n-l)st 
derivative  Z^""''"\a)  =  1.   Hence  for  small  A,  Z(A)  -  [A""-'-/(n-l) !  ]  +  0(A")  . 
When  computing  I    (Y^)  numerically  in  cases  such  as  that  presented  in  section 
8  (North  Sea  Fields,  n=60) ,  the  partial  fraction  coefficients  p.  can  differ 
from  smallest  to  largest  by  a  factor  as  large  as  10^^,  so  that  a  large  number 
of  cancellations  may  occur  when  A  is  small.   Consequently  Z(A)  must  be  com- 
puted with  extreme  accuracy  (80-100  digits)  in  order  to  attain  12-16  digits 

accuracy  for  I,,   (Y) .   This  feature  manifests  itself  in  each  of  the  repre- 
N,n  — 

sentations  of  I_,   (Y)  we  present. 

N,n  — 


lA 


An  alternative  form  for  I.,   (Y)  is  to  write  it  as  (proportional  to) 

N,n  — 

oo  ioo  n 

■—  /      [L(A)]^""   /      exp{Az}      n    [b.+z]"^dzdA, 

2iTi  ■*         i        .  T  1 

o         -it"        j=l 

where  L  is  the  Laplace  transform  of  f(*|6^).   Rotating  the  contour  by  90  , 
the  integral  in  z  becomes 

oo  n         1       1   °°  "        -1 

^/   (exp{-iAx}  n  [b.-ix]"  )  dx  +  ^  /   (expdAx)  H  [b  +ix]   )dx. 


- 1  -p 

or,  as  [b.+ix]   =  [b?  +x^]  ^exp{-i  arctan(x/b . ) } , 


1       °°  "  "  Jr 

-Re/      (exp{i[Ax  -     Z     arctan(x/b .) ] }/   H    [b^+x^ ]^)dx 
""  o  j=l  ^  j=l     ^ 


2-^2  J -%dx. 


=  -  /     cos (Ax-   Z      arctan(x/b.))      H      [b!+x 
j=l  ^        j=l       ^ 


This  last  representation  is  most  useful  for  computing  values  of  the  density 
of  Y  numerically. 


Assertion  2:   I    (Y)  possesses  the  representation 

N,n  — 

oo  oo        n  n         J 

-^T^ii^  /   [L(A)]^""  /  cos  (Ax-  E  arctan(x/b . ))   n   [b^+x^  ]"^dxdA   (3.2) 
Trr(N-n+l)  J^  Q       ^=1  J   j=l   3 

Further  calculation  yields 

Assertion   3:      The   kth  moment   E(Y\jYedY)    of  Y  ^^    given    (Y^  ,  .  .  .  ,Y   )    =  YedY 

n+1'—    —  n+1  1  n  —     — 

TTr(N-n+l)    ^^N  n^-^l"'-^     L^^^A)     [L(A)]^"""^   /  sin(Ax-   Z      arctan(x/b . ) )    H    [b^+x^j'^dx 
'  0  0  j=l  j=l  x 

(3.3) 
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The  marginal  moments  of  Y  possess  a  simple  integral  representation 

in  terms  of  the  Laplace  transform  L(A)  of  f: 

~  k 
Assertion  4:   The  expectation  of  Y   is 
n 

oo 

E(Y^^  =  n(JJ)  /  L^'^'^^^A)  [L(A)]^~"[l-L(X)]""^dX,  (3.4) 

0 


where 


9^X  0 


In  special  cases,  (3.4)  can  be  used  to  compute  exact  expressions  for 

moments.   For  example,  if  f(*|0_)  is  gamma  with  Laplace  transform 

L(X)  =  (1+X)"'', 

1 
L"(A)/L'(X)  =  -(r+1)  (1+X)   =  -(r+l)[L(X)]^  and 


1 

E(Y  )  =  (r+1)  n  (^)  /   [L(X)  ]^"""^  [l-L(A)  ]""  V(X) 
n  n 

0 


r    ■  1  \    /Nx  r   N-n+-  ,,   sU-l, 
=  (r+1)  n  (  )  j   t    r  (1-t)    dt 
n 

0 


r(N+l)  r  (N-n+^+l) 

=  (r+1)  ^ (3.5) 

r(N-n+l)  r  (N+^+l) 


r 


which  is  (2.2)  when  X  =  k  =  r=l. 
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When  L(A)  cannot  be  expressed  in  closed  form,  E(Y   )  can  be  approxi- 

n 

mated  for  large  N  using  the  fact  that  n(^)  [L(A)  ]'^~"[1-L(X)  ]"~"'"dL(A) 
behaves  like  a  beta  density.   For  example,  defining  L(A)  =  t  and 
L"(A)/L'(A)  =  H(t)  =  H(L(A)), 

E(Y  )  =  n  (^  /'  H(t)t^-"(l-t)"-^dt.  (3.6) 

n       n 

0 

The  beta  density  N(^)  t^~"(l-t )"""'"  has  mode  (N-n)/(N-l),  mean  t  =  (N-n+1) /(N+1)  , 

n 

and  variance  t(l-t)/(N+2)  <  ,  .   „.  .   Consequently,  for  large  N,  the 
density  is  sharply  peaked  and  the  major  contribution  to  (3.6)  comes  from 
the  vicinity  of  the  mean  t.   This  suggests  the  approximation 

E(Y^)  =  H(t).  (3.7) 

_l 
In  the  exponential  case,  it  is  exact:   t  =  (1+y)   so  H(t)  =  2/(l+y)  =  2t 

implying  that  H(t)  =  2[1-  -77-]  as  in  (2.3). 

N+1 

When  L(A)  is  not  expressible  in  closed  form,  compute  A  such  that 

_i  _    _ 
L   (t)  =  A,  and  set  H(t)  =  L"(A)/L'(A).   When  f(*|e)  is  lognormal  with 

mean  exp{p+%a^}  ±  M^ ,  second  moment  exp{2y+2a^}  E  M„,  and  variance 

M^2[exp{a2}-1].  ^     _  ^^2 

E(Y  )  ^  -^  ■  ^^^\.  ^  (3.8) 

1   L(Ae  ) 

Integrating  once  by  parts  yields  an  approximation  to  E(Y  )  slightly 
different  than  (3.8):    letting  A   be  such  that  (N-n+1) /N  =  L(A  )  and  A 
be  such  that  (N-n) /N  =  L(A  ), 

2  2 

E(Y  )  ^  N{L'(A„)  -  L'(A,)}  =  NM,{L(A^e'^  )  -  L(A-e^  )}  (3.9) 

n         I  1       1    i         L 


Setting  A„  =  A  +A,  when  N  is  large. 


A  =  -1/Nl'(A^)   so  that   N{L'(A2)  -  L'(A^)}  =  -L"(A^) /L' (A^) 
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Figure  4  displays  graphs  of  E(Y  )  for  a  lognormal  superpopulation  with 
E(log  A.)  =  y  =  6.0  and  Var(log  A.)  =  3.0.   The  function  L(A)  in  the  integrand 
of  the  expression  (3.4)  was  computed  to  12  digits  accuracy  by  80  point 
Gauss-Hermite  quadrature  and  a  table  of  these  values  used  in  the  external 
integration;  the  latter  was  done  to  6  digits  accuracy.   In  Table  I,  relative 
errors  of  approximations  (3.8)  and  (3.9)  are  shown  for  N  =  150,  300,  600. 

For  n  ^  .02N  (3.8)  yields  values  uniformly  less  than  E(Y  )  and  (3.9)  yields 

n 

values  uniformly  greater  than  E(Y  ) ,  so  a  simple  average  of  (3.8)  and  (3.9) 
produces  a  better  approximation  than  (3.8)  or  (3.9)  alone. 

M   UXj^e^''   ) 
1  L(A^e   ) 

is  a  cruder  approximation  that  conforms  to  (3.7). 

Applying  the  method  used  to  calculate  the  marginal  mean  of  Y   and 

n 

repeatedly  interchanging  orders  of  integration  we  arrive  at  the  following 

integral  representations  for  the  expectation  E(Y  ,  lYedY)  of  Y  ,   given 

n+m  —  —      n+m 

Y, ,...,Y   and  the  expectation  E(Y  ^  Y  ^  ^lYedY)  of  Y  ,  Y  ,  ,   given  Y, , . . . ,Y  : 
In  '^  n+q  n+q+m'—  —      n+q  n+q+m  ^      1      n 

n    -Cb . 
Assertion  4b:   Defining  Z(^)  =  J  p.e   -' ,  b .  =  Y.  +  ...+Y  ,  the  expectation 

j=l  ""        J    J 

of  Y  ,   for  m  =  1,2,...   given  YedY  is 
n+m  ^  — 


E(\+^liedY)  =m  (M  {/   [L(A)]"-"z(A)dX}-^ 

°°       N-(n+m)       A  m-1 

X  /   [L(A)]       L"(A)  /  Z(C)  [L(0    -   L(A)]    d^dA 
0                       0 


(3.10a) 
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and  the  expectation  of  Y   Y      for  q  =  l,2,...,N-n  and  m  =  0,  1,  2, . . . ,N-(n+q) 


given  YedY  is 

N-n   -1 
Z(A)  [L(A): 

0 


^(\^\^,Jle<^V    =  qn.(^-)  [^f)    (fzCX)  [L(A)]'""dA}' 


0°       N-(n+q+m)       A  m-1 

X  {/   [L(A)]         L"(A)  /   [L(0-L(A)]    L"(0  (3.10b) 

0  0 

C  q-l 

X   /  Z(w)[L(co)  -  L(C)]   dwdCdA} 
0 

Using  (3.10a)  and  (3.10b),  the  conditional  means  of  Y  , . . . ,Y„  and  their 

n      N 

correlation  structure  given  YedY  can  be  computed  numerically  to  any  desired 

accuracy  . 

That  (3.10a)  is  consistent  with  (3.4)  may  be  seen  by  setting  m  =  1  in 

(3.10a),  multiplying  it  by  the  density  of  Y  and  then  integrating  over 

Y.e[0,°°),  i  =  l,2,...,n  to  compute  the  marginal  expectation  E(Y   )  of  Y    . 

Namely, 

/„_  \  p/vj.-jN     oo    oo  n  °°       N-n-1       A 

V  1  "J  FN-  In  /  ••'/   n  Y  f(Y  )dY   /   [L(A)]     L"(A)  /   Z(C)dCdA. 

\  X   /     1(1^    u    ±J       ^  0  j  =  l  ^    ^    ^   0  0 

Integrating  first  with  respect  to  ^, 

X  n  p.     -Ab. 

/   Z(C)dC  =  I     ~   (1-e   J), 
o         j=l  j 

Then,  integrating  with  respect  to  Y_, 

.oo   n  n  P-i     -Ab, 


-    "   n  n   y-i     -AD. 

/  .../   [n  Yf(Y)][[  ^  (1-e   ^)]dY^...dY^ 


0    0^=1-^^   j=l  "j 


^  [l-L(A)]", 
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so  that  E(Y  ,,)  becomes 
n+1 


/  M  \  °°  N-n-1        n 

(n+l)(  ^J/  L"(A)tL(A)]      [l-L(A)]  dA . 


The  marginal  expectation  of  cross-products  Y  .  Y  .  ,   follow  directly 

n+q  n+q+m 

from  (3.10b):   setting  n  =  1,  Z((jo)  =  exp{-a)Y  },  so  upon  multiplying  (3.10b) 
by  the  density  of  Y  and  integrating  over  Y  e[0,°°),  (3.10b)  becomes 


/m  i\  /  .i_\   °°       N-(q+m+l)        A  m-1 

^      ^^^     °  °  (3.10c) 


q-1 
X  [1-UO]         L"(Od^dA. 

Using  (3.10c)  and  (3.4),  marginal  covariances  may  be  computed  to  any  desired 
accuracy. 
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Assertion  5:   Suppose  that  f  possesses  moments  of  all  order,  let  M,  denote 
the  kth  moment  of  f,  and  define  the  variance  of  f  as  V.   Then  the  sampling 
density  of  Y  possesses  the  asymptotic  expansion  for  large  p  =  N-n  and 
fixed  b.,  j=l,2,...,n: 

"  r/  J-,J.1^   n  Y.f(Y.  le) 

I   (Y)  n  Y.f(Y.|9)  =  {^^P"^t!^  n  ,1  J^  -  } 

N,n  -       J    J '-      r(p+l)       [pM^+b  ] 


X  {l+|pVg2(pM^,Y)  +|p2v^g^(pM^,Y)       (3.11) 
+  P(^f^^^2^  ^^^)g3(pMj^,Y)  +  0(p"  )} 


where  the  functions  g  (pM, ,Y)  are  given  by 

m        1  — 

n  an  12 


5   (pM    Y)   =       I  (pM  +b  )  +   [  I     (pM  +b  )      ]    ,                                              (3.12a) 

j=l         ^  j=l  ^ 

;    (pM     Y)    =    [    I  (pM  +b   )"  ]      +  3[   ^      (pM^+b,)"    ][    I      (pM.+b.)"']             (3.12b) 

j  =  l                 ^  j=l          -L     J  j=i          1     3 


+  2    ■    I      (pM  +b.)       , 
.1=1  ^ 

;^(pM^,Y)    =    [    I      (pM^+b    )       Ig^CpM^.Y) 


1    2      n 


+  3    [   I      (pM  +b    )      ]    [    I      (pM  +b    )       ] 
j=l  ^  j=l  ^     J 


22  n 


+   3    [   I  (pM  +b.)      ]      +  6      [      (pM  +b.) 
j  =  l  ^  j  =  l  ^      J 

n  in  _3 

+  6    [    [  (pM+b.)       ][    I      (pM+b.)       ].  (3.12c) 

j=l  ^      ^  3=1  ^ 
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Note  that  g  (pM, ,Y)  =  0[(pM, )   ].   We  further  expand  (3.2)  and  obtain  the 
mi  —         i 

asymptotic  expansion  for  large  N,  with  fixed  n,  given  explicitly  by 
n   [Y  f(Y  le)/M^]  X  {1  +^  [^(n+DM^M^  -M^   I     jY] 

i=i  -■   -"  j=i 


J=l   -^   ■'  j=l   -^  (3.13) 

+  |t  [|^(n-l)[3(n-l)'-(n+l)]  +  ^n^  (n+1)  VM^" 

+  n(n+l)(n+2)M^"  [-  ^^+  ^-^^2~  ^1^   "^  ^^(l-n^)^-^' 

+  •|n(n+l)(n+2)(n+3)V^M   +(n+l)VM^   ^b. 

+  ^[n(l-n)M   +  n(n+l)VM    ]   J  (nM  -b  ) 

j=l      -■ 


+M    I      (nM  -b  )(nM  -b.)  +  0(N   )} 
i<j  ^ 


This  last  expansion  is  of  limited  practical  value. 
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Part  II 
Lognormal  Superpopulatlons 

Expansion  (3.1])  for  the  sampling  density  Is  quite  general.   It  was 
obtained  as  "contribution  from  the  vicinity  of  y  ~  0"  without  further 
investigation  of  the  integrand  in  this  region.   For  a  fixed  small  n,  the 
expansion  (3. 13)  holds  for  f  lognormal  as  a  straightforward  specialization. 
As  n  grows  with  a  =  O(log  n)  the  second  term  of  (3. 1])  becomes  comparable 
to  the  first  term,  or  in  brisker  language,  the  expansion  breaks  down. 
Since  the  major  contribution  to  the  value  of  the  integral  (3.1)  is  from 
the  region  of  small  values  of  |y|,  it  must  be  studied  in  more  detail.   An 
asymptotic  expansion  of  (3.1)  uniform  in  this  region  is  required.   This 
is  not  surprising,  since  a  monte  carlo  simulation  of  sampling  without 
replacement  and  proportional  to  size  with  f  lognormal  yields  a  graph 
(Figure  1)  of  the  means  E(Y.),  j  =  1,2, . . . ,n, . . .  with  the  following 
qualitative  properties: 

(1)  E(Y  )  regarded  as  a  function  of  n  has  a  turning  point 

n 

at  roughly  n  ~  A,   i.e.,  the  mathematical  form  of 

E(Y  )  changes  character  in  moving  from  the  left  to  the 
n 

right  of  the  turning  point. 

(2)  To  the  left  of  the  turning  point  E(Y  )  looks  as  if  it 
decays  much  faster  than  to  the  right. 

These  properties  are  typical  of  a  change  in  the  nature  of  the  function 
E(Y  )  as  n  increases  and  are  confirmed  by  numerical  computation  of  E(Y  ) 
using  (3.4);  c.f.  Figure  3.   We  shall  construct  an  asymptotic  expansion 
exhibiting  them. 
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A  famous  example  of  this  phenomenon  was  discussed  by  Stokes  (1856) ; 

i.e.,  the  asymptotic  behavior  of  a  Bessel  function  J  (x)  when  the  order 

and  argument  x  are  both  large  and  of  comparable  magnitude.   In  our  problem, 

we  have  several  parameters,  namely,  \i,   a^ ,   n,  and  N.   The  asymptotic 

behavior  of  the  sampling  density  of  Y  and  of  the  conditional  moments  of 

Y   given  (Y  , . . . ,Y   )  =  Y    ,  depends  on  these  parameters  and  on  Y 
n         i      n-1    — n-i  — n-1 

in  addition.   Fortunately,  in  the  vicinity  of  the  turning  point  we  are 

able  to  approximate  the  density  of  Y  in  terms  of  an  integral  that  depends 

on  only  two  parameters,  each  of  which  is  a  function  of  p,  a^,  n,  Y   ,  and  p. 

— n— 1 

The  procedure  we  have  used  is  lengthy  and  consists  of  the  following 
steps: 

1.  Scale  out  the  Y-dependence. 

n 

2.  Integrate  the  Feynman  parameters,  replacing  ^  a.b.  with 

i=l  -^  ■" 
pM,   n    b.  -• 

1   V      1 
K  =  I  -j   ,  at  one  place.   (This  approximation  intro- 

duces  negligible  error  and  reproduces  the  leading  term  of 
(3.2)). 

3.  The  relevant  scales  are 

M^  =  exp{y  +  ^'}  =  O(p^). 

pM   n    b         s 


V  =  n^Hexp{o^}-l)    =  0(p) 
n  =  O(v^) 

2 

e^  =  0(.^) 
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In  terms  of  the  above,  define  new  parameters 
Q  =  [K+pMjVpV  =  O(p^) 


and 


6  =  (n-l)pV/[K+pM^]2  =  0(1) 


4.  Approximate  [G(y)]  . 

5.  Deform  the  contour  of  integration  of  (3.1)  by  90  and  approximate 

the  density  with 
n 


nQ''(P;^)   n  Y  f(Y  )/[b.+pM  ] 


(4.0) 


^0  icx. 

*  -J  {-/   exp{Q[6logz-z  -t^z^Jldz  +  /  exp{Q[6logz-z  +  %z^]}d5 

-iOO  Q 

Notice  that  the  integral  is  convergent  since  we  are  inte- 
grating along  the  negative  imaginary  axis. 

6.   Asymptotically  evaluate  (4.0).   Notice  that  the  second 
integral  is  the  complex  conjugate  of  the  first  and  we 
thus  calculate  only  one  of  them. 


That  we  can  approximate  with  negligible  error  integration  over  the 
n  n 

manifold  A  =  {a. |  ^  a.  =  1,  a.  >  0}  by  replacing  ^   a.b.  with  K  as  defined 
J  j=l  J       J  j=l  J  J 

J  n      -" 

tep  2  requires  explanation.   As  b  <  Y  a.b.  <  b,  everywhere  in  A, 

J=l   •' 


in  s 
n 


J  a.b.  is  bounded  and  it  is  not  unreasonable  to  expect  that  in  the  leading 


term  of  an  asymptotic  expansion  of  (4.0),  I   a.b.  may  be  replaced  by  a 

j=l  ^  ^ 

constant.   Intuition  suggests  an  equal  weighting  of  the  b.'s;  i.e.,  replace 

n  -t      ^ 

}    a.b.  with  —  y  b..   In  fact,  in  the  limit  p-><»,  this  is  the  appropriate 
^=2^  J  J      "  i=l  ^ 

choice.   However,  K  as  defined  in  step  2  is  an  Improvement. 
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The  integral  I„   (Y)  of  (3.0)  is  proportional  to 
N  ,n 

oo  ioo        n       _i 

J  E  -^  /   [L(X)]P  /  exp{Az}  n  [b.+z]   dzdX  (4.1) 

2^^  0        -ico       j=i  ^ 

with  L(A)  the  Laplace  transform  of  the  lognormal  density.   An  asymptotic 
expansion  of  (4.1)  may  be  performed  in  two  ways:   (1)   double  integral 
steepest  descent  or  (ii)   use  of  Feynman  parameters  together  with  double 
integral  steepest  descent. 

First  (i).   The  steepest  descent  equations  for  X   and  z  are 

K   =  I     [b  +^*r'  (4.2a) 

j=l  ^ 

and 

z^  =  pM^L(A^e'^')/L(A^),  (4.2b) 


Implying  a  steepest  descent  point  X^  satisfying 

K=     I      [b.  +  {pM  L(X^e^  )/L(X^)}]"  (4.3) 

j=l   ^ 

Starting  at  X  =  J^      [b.+pM^]   ,  direct  iteration  gives  very  fast  convergence. 
j=l   ^ 

Alternately,  rewrite 

J  1  n  °°  i°°  n 

J  =  iSzlIL  /   .../     da^...da  6(1-  I  a.)   j     /     exp{Xz-nlog(  I  a.b.+z) } [L(X) l^dXdz . 
^"^^0         0-^  "         j-1  ^        0  -i"  j=l  ^   ^ 


If  we  replace  ^  a.b.  bv  a  constant  K  and  then  do  double  integral  steepest 
j=l  J  J 

descent,  the  steepest  descent  equations  are 

X  =r^  (4.4a) 

o   K+z 
o 


and 

C^ 

z 


=  pM.L(X  e^  )/L(X  ).  (4.4b) 

o     1   o        o 
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The  two  pairs  of  steepest  descent  equations,  (4.3a),  (4.3b)  and  (4.4a), 

(4.4b)  imply 

pM   n    h . 

K  =  -^  J  ^-i^  (4-5) 

n   /.,  b.+pM,  ^    ' 

J  =  l  2        1 

2 

To  see  this,  let  p(A^)  =  L(X^e   )/L(X^)  and  rewrite  (4.3)  as 

For  large  p,  p(A^)  =  1  and  z^  =  z.  =  pM  so  that 

T   n    b.     1 

\  =  -^  [1+  i  y  — 1— r 

*   pM,  ^   n  /,  b.+pMj 

pM   n    b .      1 
.„[p„^f^  J^^),-  =^,  (4.6) 

3=1   1    1        * 

The  error  induced  in  the  leading  term  of  an  asymptotic  expansion 

of  J  by  replacing  2.  Cb.  with  the  constant  K  is  0(p  ).  We  do  so  in  what 

j=l  ^  ^ 
follows: 

Assertion  6:   For  large  p,  the  density  of  Y  possesses  an  asymptotic  expansion 

n 

uniformly  valid  for  6  finite  and  positive;  it  is  of  the  form  I    (Y)  H  Y.f(Y  ) 

N,n  -    .^   2        2 

where  for  6  <  1/4, 

N,n  —      p.    (n-1).  2        1 

X  |aQ|{Hi(|T_Q|''"^)  -  Q-'^^CQHi'(|T_Q|'^3)}  (4.7a) 


where 


|T_|  =  ||i  TTU  -   Slog  ^±^Si  I  ,  (4.7b) 

^  ^  1-/1^46 


IqI  =  (26)^1  T_|''"^l-45)  ^  (1  +  2/5)  *,  (4.7c) 


and 


For  6  >  1/4, 
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=0  -  (  i^  )*|TJ-^'.  (4.7d) 

"     1+2,^ 


A  =  i  6log6  -  i  -  i  6,  (4.7e) 


Hi(+  x)  =  /  exp{-  yt^f  xt}dt.  (4.7f) 

0 


.n-  H   „ .   n 


where 


:  IIqKhK-t^qI'''^)  -  q-'^'^IcqIhi'Mt^qi'"^)}  (4.7g) 

|t_|_|  =  ||25  arctan  A5^  -  |-A6^I  (4.7h) 

la^l  =  (26)^It^i'"'^(46-1)"^(1+2A)"^,  (4.7i) 


and 


"        I+2A      ^ 

The  above  formulae  connect  at  6  =  1/4: 

i,_„(i)  =.  6-\an)  -^  ^  .'>"  .ybj^H^]-"  (4.7k) 
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5.   Uniform  Asymptotic  Expansion  of  Density 

Asymptotic  evaluation  of  Integrals  of  the  form  (4.0)  Is  usually  done 

by  the  method  of  steepest  descent.   This  procedure  requires  that  we  locate 

the  zeros  of  the  derivative  of  f(z)  =  6log  z  -  z  +  yz^.   In  particular,  we 

obtain   f'(z)  =  1+z  =  0  with  two  solutions  ~  +  k   [1-46] ^   From  this 

z  I  -   I 

it  is  clear  that  within  the  range  of  desired  6s  (0  <  6  <  1)  the  two  roots 
come  close  to  one  another,  coalesce  at  6  =  t>  and  become  complex  for 
^   >  -T'      This  means  that  for  6  =  t^  the  major  contribution  to  the  integral 
comes  from  the  vicinity  of  z  =  —  which  is  the  vicinity  of  the  two 
coalescing  saddle  points.   We  explicitly  see  why  in  this  region  the 
asymptotic  expansion  (3. H) breaks  down. 

For  n,a^  fixed,  N-x»,  (3. 13)  is  valid,  provided  N  is  sufficiently  large. 
However,  in  our  particular  problem,  n  and  a^  can  assume  values  such  that  the 
value  of  N  required  for  (3.1^  to  be  valid  is  beyond  the  range  of  feasible 
values  of  N.   In  particular,  we  are  Interested  in  values  of  N  such  that 
log  N  is  "not  too  big."   Throughout  the  analysis  we  treat  N-n  as  a  large 
parameter,  which  means  that  we  exclude  values  of  n  =  0(N).  When  n  =  0(N) 
the  play  is  nearly  exhausted  and  we  shall  not  deal  with  this  case  here.   It 
is  clear  that  (3. 13)  breaks  down  in  the  following  region: 

n  =  O(p^),  M^  =  O(p^). 

Furthermore,  if  all  Y.'s  were  equal,  \   jY.  =  0(n^M  )  =  0(p  **)  ,  so  we  allow 

?  5^        ^  j=^   ^  1  1     1  " 

),  jY.  =  0(p  **).   The  probability  that  the  term  in  (3.  1^,  -^(n+l)M  M  -  M   /  jY. 
j=l   ^  2       2  11  .^Q   J 

is  zero  is  negligible.   Thus  we  have  to  evaluate  (3.1)  in  this  region  in 
order  to  obtain  a  uniform  asymptotic  expansion  for  large  p.   Note  that  (3.1) 
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was  essentially  obtained  as  half  the  difference  of  two  integrals,  and  we 
wish  to  keep  this  in  raind. 

In  determining  (3.11)^  we  made  use  of  the  fact  that  the  major  contri- 
bution to  the  density  comes  from  the  vicinity  of  y  ~  0  with  y  being  in  the 
lower  half  of  the  complex  plane.   This  is  still  true  here,  but  further 
study  of  the  structure  in  this  vicinity  is  necessary. 

First,  consider  G(y)  for  y  =  ^  -  iri  with  E,   and  n  both  small: 

CO 

■   G(y)  =  — ^—  /  exp{-nx-  i(logx-y)2}  ^  e"^^^  (5.1) 

v^^2   0  ^  "" 

~i^x  1  ?  ? 

Approximating  e      by  1-i^x  -  — ^  x  ,  due  to  the  effect  of  exponential 

—fix 
factor  e    the  resulting  three  integrals  do  not  grow  as  fast  as  with  n  =  0- 

Define  for  j  =  0,1,2,. .. 

CO 

m.(n)  =  /  exp{-r|x-  ^(logx-y)^}  ^  x^  (5.2) 

J      /2^2   Q  2  X 


We  now  wish  to  approximate  [G(y)]   using  the  quantities  m.(ri),  j  =  0,1,2. 

[G(y)]P  =  exp[plogG(y)] 

1  9 

-   exp[p  log(mQ(ri)  -  i^m^(ri)  -  '2^  m^(T])] 

=  exp[p{log  mo(n)  -  iC  ^;^  -  2^H  i;;^  -  ^-r^  ))]  (5.3) 

In  (5.2),  m_(ri)  is  the  LaPlace  transform  L(ri)  ,  and  the  rest  of  the  m.(r|)  can 
be  expressed  as  derivatives  of  m  (n) .   The  behavior  of  [G(y)]^  for  very  small 
y  is  studied  in  great  detail  in  Barouch  and  Kaufman  (1976),  where  it  has  been 
shown  that  replacement  of  m.(ri)  by  M.  is  a  reasonable  approximation  for 
log  n  =  — X —  o^.   Hence,  the  approximation  for  [G(y)]^  takes  the  form  [y  -   ^] 


32 


[G(y)]P  -  exp{p(-iM^y-  ^  Vy2)  (5.4) 

To  improve  (5.3),  one  may  use  more  terms  in  the  expansion  of  G(y) .   In 
particular,  formula  (4.5),  (1976)  op.cit.,  is  uniformly  valid  for  y  -  0. 
Formula  (3.1)  now  takes  the  form 

T   (Y)  =  ^P'^^  /  •••/  6(1-  y  a.)da,...da 
N,n-     p!   ^^  i^  .^^  ,   1 

OO 

X  -|[i"  /  y""-^exp{-iy(K+pM^)  -  -pVy^ldy 
0 

— OO 

+   (-i)"  /  y"~^exp{iy(K+pMj^)  -  |pVy2}dy]  (5.5) 

0 

where  K  =  Z .a.b . . 
J  J  J 

-1  -1 

Let  y  =  (K+pM^ )   z,  and  scale  out  a  power  of  (K+pM^ )   .   We  further 

1  " 
approximate  K  as  —  )^  b .  in  the  resulting  coefficient  of  y^  and  use  Feynman's 

^  j  =  l  ^ 
identity  once  more  to  integrate  the  power.   Hence  we  obtain 


\  Jl)    =  n(P!^)   n  [b.+pM  ]  ' 
N,n—       n    ..il    1 


*  |[i"  /  dz^zf  ^expMz^  -  I  (j,P^  )J  z^')  (5.6) 


1' 


— OO 

,  /  .^n  f      n-1    r.      1   pV       2  11 
+  (-i)  j^   dz2Z2   exp{iZ2  -  2  (K+pM^)^  ^2  ^^ 


Both  integrals  are  clearly  convergent,  and  we  deform  the  contours  to 
z   =  -iz.   and   z„  =  iz_   to  obtain 
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-loo 


I    (Y)  =  n(^"^)   n  [b  +pM  1  ^  *  -rf   z"  "^  ext>(-z  +  — ^ r  z  ^Idz 

N,n-'     ^  n  '    ,^^^    ]^   l'  2^-'^   ^3    ^^   3   (K+pM  )2  ^3  ^3 

ioo 

+  /   z^"W(-Z3  +^  (K+pM,)^  ^^''}dz^].  (5.7) 

o  '^  1 

Define  the  following  parameters 

(K+pM  ) 2      , 
Q  =   pv     =  O(p^)  (5.8) 

Note  that  6Q  =  n-1  =  integer.   We  scale  z  once  more,  defining 

z^  =  Qz  (5.10) 

and  (5.6)  takes  the  form 

i„  Jl)  =  ri(^T^  (f    n  [b.+pM  ]"' 

N,n  n  ,    ,      1        i 

—  XOO 

*  I   [/     exp{Q(6log   z  -   z  +  |z2)}dz  (5.11) 


loo 

+  /   exp{Q(6log  z  -  z  +  •iz^)}dz] 
o 

The  next  step  is  asymptotic  evaluation  of  the  two  integrals  in  (5.11)  for 

large  Q  and  fixed  6.   The  relevant  range  of  5  is  0  <  6  <  0(1),  which 

incorporates  6  =  1/4.   The  standard  method  of  steepest  descent  is  not 
applicable  here  since 

f'(z)  =  3-  (6log  z  -  z  +  ^2)  =  -  -  1+z  =  0  (5.12) 

dz  Z      z 


has  two  solutions 


z+  =  +^t\[l-^S]^  (5.13) 
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Thus,  when  5  =^  1/^,  the  two  solutions  are  very  close  to  one  another,  and 
both  contribute  to  the  value  of  the  Integral. 

The  problem  of  coalescing  steepest  descent  points  has  been  dealt  with 

by  many  authors.   In  particular,  we  follow  Chester,  Friedman  and  Ursell  (1956) 

* 
and  Bleistein  and  Handelsman  (1975)  ;  BH  (1975)  provides  a  systematic  method 

for  uniquely  determining  the  phases  required. 

The  basic  idea  lies  in  a  change  of  variables: 

6log  z  -  z  +  yz^  E  -^u^  -  (;u+A  (5.14) 

where  6and  A  are  constants  to  be  computed.   This  transformation  has 

three  branches  in  the  complex  u  plane,  but  only  one  is  analytic.   More 

precisely,  one  of  the  three  branches  is  a  conformal  mapping,  namely  —  ^  0. 

Assuming  we  select  this  branch  correctly,  and  take  derivatives  of  (5.14), 

we  obtain 

(^-  1+z  )  =  (u^-C)  (^)  (5.15) 

z  dz 

% 
as  z  ->■  z   and  u  ->■  +^  . 


Following  CFU(1956),  we  obtain 


A  =  |t5log5-  -|  -  itS  (5.16a) 


and 


^'-1 


rl    +    /1-46-,  1 


-Ifilogf"-         _"-  ^   }  -  f/l-46|    s  T_  for  46   <   1 
1  -  /[^ 

-i|26   arctan/46-1  -  |-/46-l|    e  t_^  for  46   >  1 


(5.16b) 


\ 


* 

Henceforth  we  denote  these  two  references  as  CPU  (1956)  and  BH(1975) , 

respectively. 
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One  now  needs  to  determine  the  phase  of  C^.   This  is  done  by  use 

* 
of  the  method  described  in  B(1975): 

r 


M 


< 


T_ 

\-ii^/3 

1st 

integral 

T_ 

H^iTr/3 

2nd 

integral 

T+ 

H   -i7T/6 
e 

1st 

integral 

T+ 

^    iTT/6 

e 

2nd 

Integral 

(5.17a) 


(5.17b) 


dz 
Furthermore,  following  CFU(1956)  we  approximate  -r—   near  z  =  z.   as 

du  + 


d^  ^  ^0  -^  ''o" 


(5.18) 


Next,  in  order  to  compute  a^  and  b_  for  the  four  integrals,  we  take  the 
derivative  of  (5.15)  with  respect  to  u  to  obtain 


d=^u 


,du. 


2"  =  <J  -  1-^-)  I^  ^  (1  -  I^)  ^Z^ 


h 


In  (5.20)  one  may  substitute  +?  and  z  respectively,  and  the  first 
term  drops.   So  we  have 


(5.19) 


Vlxt'   =  (1  -  Cz^')  (Hq  +  Aq)2 


(5.20) 


The  last  difficulty  occurs  in  taking  the  square  root  of  (5.21).   This 
is  done  uniquely  by  direct  computation  of  the  phase  of  du/dz,  as  demon- 
strated in  BH  (1975) .   For  z  near  z   ,  one  may  write 


Throughout  this  analysis  the  subscript  +(-)  means  6  >  1/4  (5  <  1/4) , 
respectively. 
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J    z ,  -  z 

This  determines  the  phase  uniquely.   The  rest  is  brute  force,  and  the 
results  are  explicitly  given  in  (4.7). 
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6.   Connection  of  (3.11)  to  the  Uniform  Expansion 

To  show  that  the  uniform  expansion  (4.7)  connects  to  (3.11),  we  perform 

a  small  6  expansion  for  the  approximation  (5.0)  to  the  density  of  Y.   That 

Is,  we  perform  an  asymptotic  expansion  of 

n  n    °° 
J  =  (n-i);  /  g^^  Q[5logz-iz-  |z^]  }dz  (6.1) 

for  large  Q  and  fixed  small  "5  <  t-  •   The  two  steepest  descent  points  are 

^1,2  =  2  [l+d-^-S)*]. 
For  small  6, 

z^  ^   -1(5+6^)  and  z^  ^   -1(1-6-6^), 

and  only  z^  contributes.   Thus, 

n  n  °° 

J  ^  T^^^JV   exp{  Q(-iz^  -  |z^2  +  6iogz^)}  /  exp{^  Q[-l  -  ^  ](z-z^)2}dz 

0  1 

=  j^^^   exp{  Q(-  ^[l-(l-45))^]  +  I  [l-(l-46)^]'  +  6log[^  -  ^   (1-46)^]} 

X  {2ttQ"\6[|  -  -|(l-46)^]"'-l)"'}"^ 

Since  6Q  =  n-1,  we  take  the  limit  Q-**",  n-»<»,  6^0,  and  write  for  (n-l)I  its 
Stirling  approximation.   After  manipulation 

J  ^^^.i^i    (2TT5/Q)'exp{-Q6  +  ^  Q6^  +   Q5log6}  .  (6.2) 

Ap  p  r  ox Ima  ting 

Q-'  ^^  {  [  (b.+pM-)-'}2=^g2(pM-.Y). 
n   i-i     J    ■•-       n   ^   X 


(6.3) 
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where  g-  is  defined  by  (3.12a), 

J  =  (1+^)  exp{|Q"'(n-l)2}  -  l+|pVg2(pM^,Y),  (6.4) 

the  first  two  leading  terms  of  (3.11). 

In  sum,  the  approximation  to  the  density  of  Y, 
,n     n   Y,f(Y  |e) 

r  n     -J i 


p.     Q     (n-1).     j^   lb  +pMj^J       2   2 


X  exp{Q(-6-  |[l-(l-46)^]2  +  61og[|  -  |(l-46)^])} 

is  an  improvement  of  (3.11)when  the  conditions  of  assertion  1  hold. 

To  demonstrate  connection  of  (3.1])  to  the  uniform  expansion,  we 

use  (6.2).   In  the  uniform  region  we  h^ve  0(n)  =  0(e  )  =  0(p  ),  and 
for  6  <  -^  , 


with 


It  I  =1  (1(1-46)^ -6iog(l±^i:^)} 

^  ^  l-(l-46)^ 


and 


oo 

Hi(Z)  =  /  exp{-  ^^+Zt}dt,  |argZ|  <  ^ 


(6.5) 


Vn^^>=^7^^^'»^^lQ^-|''^>  l%l  ^'-'^ 


|a J  =  (26)^|T_|^«(l+26^)"^  , 


A  =  -^tSlogS-  4  -  2^  » 


■2y, 
Note  that  as  6->0,  |t_|  is  finite,  hence  |QT_|  ^  is  large.   Asymptotically 

(ZE|QT_|  ''~°°)  we  have 
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Hi(Z)  ~  (f)^{l^(|  z'"^)  +  I_i,^(|  Z^2){A"^exp{|  ■l'^-)  (6.7) 

where  Ii   is  a  modified  Bessel  function. 


Thus,  for  small  6,  (6.5)  takes  the  form 


%P,n-  y-i  _  1^ 

J  =  (^.r)\    '^-^1   'exp{Q[^log6-  ^  -  ^]} 


X  6^  *^  |T_|^^exp{|Q[i(l-26-262)  +  6(log6+25)]}  (6.8) 


=  (^)  -(^^?3jr  exp{-6Q+Q6log6+  ^^q} 


which  is  (6.2)  exactly.   Since  (6.2)  connects  to  (6.5)  as  well  as  to 
(3.  11)  and  to  the  uniform  expansion  for  6  =  -r-  .  we  may  view  (6.2)  as  an 
intermediate  step  in  the  connection  of  (6.6)  to  (6.5)  and  to  (3.11). 
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7.   Approximation  of  Conditional  Expectations 

In  order  to  motivate  a  procedure  for  calculating  an  approximation 

to  E(Y  .,|YedY),  we  show  that  the  density  of  Y  =  (Y,,...,Y  )  given  that 
n+1    ~  —In 

n+1  discoveries  are  known  to  have  been  made  but  that  values  of  only  the 
first  n  will  be  observed  is  identical  to  the  marginal  density  of  Y. 
The  density  of  Y  given  that  (n+1)  discoveries  have  been  made  is  (suppressing 
display  of  6^  in  f  (•  |  9)  )  . 

Doing  a  partial  fraction  decomposition  using  the  identity 

S'^b.  +  S]  ^  =f-  /  (1  -  e   J)e'^^dA,  (7.2) 

^  J   0 

(7.1)  becomes 

'  r  ,   n  Y.f(Y.)dY.  /  [  );  r^  (1  -  e   J)][L(X)]''  "  ^-L'(X))dX.     (7.3) 
r(N-n)  .^^3    J    J  ^Q  .^^  b. 

An  integration  of  (73)  by  parts  shows  it  to  be  Identical  to  (1.1a),  the  density  of  Y. 

Let  h(Y,Y  ,,)  denote  the  joint  density  of  Y  and  Y  ,.  and  let  g(Y) 
—  n+i  —      n+i  — 

denote  the  density  (7.4)  of  Y  given  that  n+1  discoveries  are  made.   The 

conditional  expectation  Y  .^  given  Y  e  dY  is 

—n+i       —    — 

oo 

U\,,\l.m-^ ^^ (7.4) 

If  we  express  the  density  of  Y  , . . . ,Y  ,  Y  ^  in  the  form 


n 


-1    V   ^L  "  -1 

=  ^  — ^  with  p  =  n  (b.  -  b.)  ■", 

^    i=l 


j=l  ^  j=l  ^j^^  ""    -^   ^ 
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since  each  partial  sum  (Y.+. ..+Y  +Y   )  contains  Y  ^,   the  partial  fraction 

coefficients  Involve  Y, , . . . ,Y  ,  but  not  Y  ,,.   That  is,  defining 

1      n  n+1  ^ 

b!  =  Y,+...+Y  ^-  p!  =   n  [b'-b']    for  j=l,2, . . . ,n+l ,  the  p!s  depend  only 
1    1      n+1  "^J    ^^1   1  j        J   .  .    .    .     Kj 


1=1 
±^3 


on  Y,  , .  .  .  ,Y  . 
1      It 


Consequently,  defining  b  ^^  =  0  we  may  write  the  conditional  expectation 


of  Y  ,,  given  Y  e  dY  as 
n+±       —    — 


-J r(N+l) 

g(Y)dY  r(N-n+l)  .1!^  ^j^^^j^'^^j 


°°  n+1     -Ab.    «>  -Ay  ., 
X  /   [  I  p:  e   J]  [/  e   --^1 

0   j=l  ^  0 


f(Vl>  ^n+l^Vl^  [L(X)]''"'dA 


N-n-1  n+1     -Xb. 


^f(iS)".!!  ^j^^^j^  /  L"a)[L(X)]     [J     p-  e   J]dX 


(7.5) 


Approximating  L"(X)  by  writing  it  as  L' (X) [L"(X)/L' (A) ]  and  setting 


M   -X(M  /M  ) 
L"(A)/L'(A)  -  rp  e 
"l 


(7.6) 


we  have 


E(\+i|YedY)  =  [(N-n)  (M^/M^)] 


00 

'n+l            -A(b. 

+  (M^/MJ)"' 

N-n- 

-1 

i 

I   p:  e      J     -   ^ 

.1=1     J                                   J 

L 

(A)L'(A)dA 

0 

00 

r  n             -Ab." 

N-n 

/ 

I     V,e       ^ 

L        (A)dA 

0 

Lj=i    ^          J 

(7.7) 


or  in  terms  of  h  and  g. 


M    h(Y,M  /M  ) 


g(I) 


(7.8) 


4? 


The  same  result  follows  from  a  similar  treatment  of  the  complex 
integral  representation  (3.1)  of  the  density  of  Y:   the  conditional  expec- 
tation of  Y  ^.    given  Y  G  dY  is 
n+i 


«    n+1 


(N-n) 


/  5(1-  I   a  )da  ...da   /  y"(G(y)]   "   /  exp{-iy(  j   a.b.  +  A  ^,)]A^^  f(A  ^,)dA  ^  dy 
0    j=l  J    ■■■     ^^^   0  0        j  =  i  -1  J     "■'"1   """"l   ""^1   n+1 


n+1 


/  6(1-  I   a,)da  .  ..da   /  y"[G(y)]   "   /  exp{-ly(  Y  a.b.  +  A  _^  ^  }A  ^,f(A  ^JdA  ,  dy 
0    j=l  -'  0  0        i=l  -^  -^  "  -^   "^   ""'"-'^ 

{7.9) 


(N-n) 


oo     n+1  *"        vr_   1  ^ 

I     6(1-  y  a.)da,...da  ^,  /  y'^[G(y)]^  "■V(y)exp{-iy  I   a  b  }dy 

0     1=1  ^   ^     "^  0   1=1  ^  ^ 

oo      n+1  °°        XT    1  '^ 

/  6(1-  I   a  )da^...da^^^  /  y''[G(y)  l^'^^'-^G' (y)exp{-iy  I   a.b  }dy 

0     j=l  ^  "    o  j=l 


Write  G"(y)  =  G' (y) [G"(y)/G' (y) ]  and  approximate  G"(y)/G'(y)  as  follows; 


M  M 

2  -1  2 

G' (y)  =  -iM  exp{-i  X7~  Y^  ^   -iM  e  exp  {1-i  Try) 


so  that 


-1  ^2 

-iM  e   exp{exp{-i  —  y}} 

1 


i|^logG'(y)  ._exp{-i-y} 


Upon  making  this  substitution,  (7.9)   is  seen  to  be  identical  to  (7.7) 
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8.   The  North  Sea  -  Some  Preliminary  Computations 

An  indispensable  ingredient  of  a  predictive  study  of  world  oil  markets 
is  a  method  for  forecasting  the  number  of  undiscovered  fields  and  their  sizes 
in  each  of  the  world's  major  producing  regions.   The  North  Sea  petroleum 
province  is  clearly  of  key  importance,  and,  as  part  of  a  longer  term  project, 
the  M.I.T.  World  Oil  Project  has  begun  a  study  of  it. 

From  February,  1968,  when  the  first  North  Sea  Field  was  discovered,  to 
late  1975,  60  fields  were  found  in  the  North  Sea.   The  sizes  of  these  60  fields, 
measured  in  recoverable  barrels  of  oil  equivalent  and  plotted  in  order  of  dis- 
covery, are  displayed  in  Figure  5  .   In  fact,  this  sample  is  a  mixture  of  at 
least  four  distinct  geological  populations  or  plays:   Chalk,  Jurassic  North, 
Jurassic  Central,  and  Tertiary. 

As  a  first  step  we  chose  to  treat  the  entire  North  Sea  as  a  single  play. 
Results  of  some  preliminary  calculations  are  presented  here.   They  give  the 
flavor  of  what  a  more  detailed  study  may  reveal,  and  emphasize  the  importance 
of  "blocking"  the  data  into  geologically  homogeneous— units  when  this  is 
possible. 

Aside  from  Statfjord  (observation  number  32)  an  eye-ball  inspection  of 
Figure  5  indicates  that  as  n  increases,  the  graph  of  size  of  discovery  tends 
to  decrease,  although  fluctuations  remain  large.   The  "within"  play  varia- 
bility of  a  graph  of  sizes  of  discoveries  is  substantially  smaller;  e.g.. 
Figure  9  shows  such  a  graph  for  Juressic  Central,  where  only  ten  fields  have 
been  found.   Figure  6  shows  sample  fractiles  for  the  North  Sea  as  a  whole 
plotted  on  lognormal  probability  paper.   This  graph  displays  a  characteristic 
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feature  of  sampling  without  replacement  and  proportional  to  random  size 
when  the  underlying  superpopulation  is  lognormal:   the  right  tail  appears 
fatter  than  lognormal. 

Using  the  uniform  expansion  (4,7)  for  the  density  of  n  =  60  North  Sea 
discoveries,  we  computed  a  series  of  iso-contour  graphs  of  the  likelihood 
function  for  (\i,o^)    pairs  given  fixed  N;  Figure  8  is  typical.   Each  such 
graph  we  computed  appeared  unimodal  in  (\i,a^) ,   and  relatively  "flat"  in  the 
neighborhood  of  the  maximum.   Table  II   is  a  set  of  summary  statistics, 
showing  how  a  maximizer  of  the  likelihood  function  approximated  by  (4.7) 
behaves  as  a  function  of  N.   Since  (4.7)  is  designed  for  n  exp{a^}/N-n  =  0(1), 
it  behaves  best  for  N  near  300  . 

Figure  7  displays  the  graph  of  the  likelihood  function  for  N  with  fixed 
VI  =  5.70  and  o^  =  1.36.   Function  values  L(N|y,a^  ,data)  ,  N  =68,...,  358 
were  computed  numerically  using  a  Rhomberg  routine  to  evaluate  L(A)  at  100 
points  .001(. 01)1. 001  to  25  digits  accuracy.   The  function  Z(A)  was  computed 
to  100  digits  accuracy  using  the  M.I.T.  Mathlab  MACSYMA  software  system. 
External  integration  of 

oo 

/  Z(X)[L(A)]^"''dA 

0 

358 
was  done  using  a  modified  Simpson's  rule.   The  sum  Z  L(j |u,a^ ,data)  was 

j=68 
computed  and  used  as  a  normalization  factor. 

Values  y  =  5.70  and  o^   =  1.36  are  maximizers  of  L(y ,a^ [N,data)  for  N=300. 
Since  60exp{l. 3  6}/240=.97,  the  uniform  expansion  (4.7)  is  a  valid  approxi- 
mation to  the  density.   However,  the  likelihood  function  for  N  given  these  values 
of  y  and  a^        is  maximized  at  N  =  99,  so  N  =  300,  y  =  5.70,  a^   =   1.36  is 
clearly  not  a  maximizer  of  the  joint  likelihood  function  for  N,  y,  and  a^. 
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A  similar  calculation  for  the  Jurassic  Central  play,  using  the  repre- 
sentation  (3.2)  of  I    (Y)  was  done  by  John  Nelligan,   Figure  10  is  the 
normalized  likelihood  function  for  N  given  y  =  4.778  and  o^  =  1.264.   Both 
Figure  7  and  10  may  be  interpreted  as  approximate  posterior  probability 
functions  for  N  given  y  and  o  and  that  a  diffuse  prior  probability  function 
has  been  assigned  to  uncertain  N. 

The  right  hand  side  of  Figure  5   shows  a  graph  of  conditional  expec- 
tations E(Y  |y  _  , . . . ,Y  )  for  n  =  61,..., 96.   They  were  computed  sequentially 
using  the  relatively  easy  to  compute  approximation  (7.8  ):   first  E(Y^, | Y,_ , . . . ,Y  ) ; 
then  assuming  that  Y   =  E(Y,^  | Y,_, . . .Y  )  E(Y,  |y,  , . . . ,Y^ )  was  computed,  etc. 

With  prevailing  cost,  price,  and  revenue  sharing  agreements,  the  ninimum 
economic  field  size  in  the  North  Sea  is  250  million  barrels  of  recoverable 
oil  equivalent  (BOE) .   It  is  improbable  that  price  relative  to  cost  vrill  rise 
enough  during  the  next  five  years  to  make  any  field  with  less  than  100  million 
BOE  economically  viable.   For  fields  larger  than  250  million  BOE,  the  rate  of 
development  and  extraction  of  reserves  may  also  depend  on  field  size. 

As  a  first  step  in  the  economic  analysis  of  additions  to  reserves  from 
new  discoveries  (in  10^  BOE) ,  the  interval  [100,0°)  was  divided  into  four  sub- 
intervals  I  ,...,I,  as  shown  in  the  column  headings  of  Table  HI.   Then 

P{Y^- ,,  el . Iy,^, . . . ,Y, }  and  the  partial  expectation  of  Y,^,,  in  I.  was  computed 
60+k  1 '  60      1  60+k     i 

for  k  =  1,...,36  and  i  =  1,2,3,4.   A  sample  of  these  quantities  are  shown  in 
Table  III.  The  probabilities  in  particular  give  a  feel  for  the  variability  of 

^60+k  s^^^"  ^60"--'^r 

In  a  subsequent  paper  we  will  present  methods  of  parameter  estimation, 
of  forecasting  and  numerical  procedures  for  computing  the  likelihood  function, 
moments  and  the  correlation  structure  of  the  model. 


* 

As  part  of  his  Ph.D.  thesis  currently  in  progress, 
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Table  II 


Approximate  Maximum  Likelihood  Estimates  of  y  and  a 
For  N  Fixed  Computed  Using  Uniform  Expansion  (4.7) 


N 

y(N) 

S'(N) 

80 

5.90 

.87 

100 

5.82 

.93 

120 

5.81 

.98 

150 

5.77 

1.06 

180 

5.75 

1.13 

200 

5.74 

1.17 

250 

5.71 

1.27 

300 

5.70 

1.36 

400 

5.66 

1.52 

nexp{a^(N)}/N-n 

Log 
Likelihood 

7.16 

-53.38 

3.80 

-59.24 

2.66 

-63.24 

1.92 

-68.07 

1.55 

-72.18 

1.38 

-74.67 

1.12 

-80.21 

.97 

-85.03 

.69 

-93.15 

Table  III 


Probabilities  That  a<Y  <b  given  Y  ,,.,.,¥.  For  if=61,  .  .  .  ,96, 

n         n-i      i 

Partial  Expectations  in  Parentheses 
Intervals  (a,b) 


I 

II 

III 

IV 

N 

125-250 

250-375 

375-500 

Over  500 

61 

.2408 

.1401 

.1112 

.4226 

(35.5) 

(47.9) 

(55.9) 

(365.5) 

70 

.2374 

.1396 

.1115 

.4283 

(31.9) 

(43.7) 

(51.5) 

(321.4) 

80 

.2346 

.1393 

.1117 

.4331 

(31.6) 

(43.3) 

(51.2) 

(317.5) 

90 

.2324 

.1389 

.1119 

.4368 

(27.2) 

(38.0) 

(45.7) 

(264.5) 

96 

.2312 

.1388 

.1120 

.4387 

(26.2) 

(36.7) 

(44.4) 

(252.8) 
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Appendix 


A.l   Computation  of  (2.1)  through  (2.10) 


The  moment  formulas  given  in  Section  2  are  easily  computed  using  two 
facts:   first,  the  integral  of  (1.1)  over  Y.  e  (0,°°),  j=l,2,...,n  is  one, 
and  second,  expectations  of  arbitrary  powers  of  the  Y.'s  can  be  represented 
in  terms  of  an  integral  with  integrand  of  the  form  of  that  in  (1.1).   To 
illustrate  we  evaluate 

00    CO      j\  —  Xy 

E(Y^)  =  w?4tT  /  •••/  Y*"  {  n  XY.e   ^  [b  .+S]"'dY. } 
n    r(N-n+l)   0    0     j=l  ^       "■       ^ 

.  -XS,,  .N-n-1 

^     r  M  \ dS 

r(N-n) 

Since  the  last  term  of  b.,  j=l,2,...,n  is  Y  ,  if  we  let  bl  =  Y.+. ..+Y  , 

2  n'  3    J      n-1 

and  define  U  =  S+Y  , 
n 

oo    00  n~l     —  Xy 
g^~k^  ^  r(N+l)r(k+2)  J    ^j      ^^     ^Y.e    ^[b:+U]-'dY.} 

"    r(N-n+l)x''    0    0  j=l   ^       J       ^ 

T.^-^U,,.,.  (N-n+k+l)-l 

r(N-n+k+l)      "*" 

By  analogy  with  (1.1)  the  value  of  the  integral  above  is  r(N-n+k+l) /r(N+k+l)  so 


r^fyK   =  r(N+i)r(N-n+k+i)  r(k+2) 

•^^  n''    r(N+k+l)r(N-n+l)    ,k 


X 
Formulae  (2.2)  through  (2.5)  can  be  computed  in  a  similar  way. 
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A. 2  Derivation  of  Assertion  1. 


Our  objective  is  an  asymptotic  expansion  of  the  integral 

0    0  j=l   ■'  k=n+l 

where  b.  =  Y.+...+Y  .   To  this  end  rewrite  I„   (Y)  using  the  definition 
J    J      n  N,n  -' 

N  00  -iy(ZA^-S) 

<5(S-  I  A^)  =  ^  /  e         dy  (A. 2) 

k=n+l        -oo 

of  the  Dirac  delta  function  as 

oo  CO  iyS 

n  [b,+s] 
j=i    ^ 

where 

oo 

G(y)   =   /     e"^^''f(x|e)dx.  (A. 4) 

0 

Feynmann's  identity  allows  us  to  set 

n 

1    1  da, . . .da  6(1-  7  a. ) 
n       _  1     n     .^  J 

n  [b.+S]  ^  =  (n-1):  /  .../ ^^^ ;  (A. 5) 

^=^  '  °    °    [S+l   a.b.]" 

j=l  J  J 

substituting  the  right  hand  side  in  (A. 3) 

1     1 


n 


Im  (Y)  =  ^ilr)l   ^T^]^'   I   ■'■!     da-... da  6(1-  [  a. 

N.n  —     2Trr(N-n+l)    •"     •*     1     n     .  ,  l 

'  0    0  j=l  -" 


00        CD        iyS 

X  /   dy  /     \     ^^     .  [G(y)]^-" 

°  (S+  I   a.b.)" 
j=l  ^  ^ 


(A.6) 
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Figure  (A.l) 


Caption:   Contour  over  which  (A. 9)  Is  evaluated  in  the  complex  y  plane. 


Im  y 


t 


->- 


log  y+2iTi 


log  y 


> 


Re  y 
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n 

Let  K  =  y  a.b..   We  need  to  evaluate 

j=l  J  ^ 


00 

/  e^y^(S+K)~^dS  =  e"^y^[g(y)+h(y)logy]  (A. 7) 

0 

where  g(y)  and  h(y)  are  analytic  functions  in  the  lower  half  plane  of 
complex  y  and  B>0.   The  functions  g(y)  and  h(y)  are  explicitly  determined 
below.   In  order  to  integrate  the  RHS  of  (A.  7)  on  (-°°,«')  ,  we  note  that 
we  can  close  the  contour  in  a  semicircle  in  the  lower  y  plane,  and  the 
only  non-vanishing  contribution  comes  from  the  logarithmic  singularity 
at  y  =  0.   Thus,  g(y)  does  not  contribute  to  (A. 7),  namely 

oo 

/  e-^y^(y)[G(y)]N-%  =   0.  (A. 8) 


—CO 


We  now  deform  the  contour  of  integration  around  the  cut  y-plane  as 

seen  in  figure  (A.l).   Across  the  cut  there  is  a  discontinuity  of  27Ti. 

More  explicitly  we  have 

00  -igy  N-n      0  -iyB  N-n 

/e    k(y)logy[G(y)]   dy  =  /  e    (2TTi+logy)h(y)  [G(y)  ]   dy 


0  -iyB  N-n 

-  /  e    h(y)logy[G(y)]   dy  (A. 9) 


0  -iyB         N-n 
=  2TTi  /  e    h(y)[G(y)]   dy. 


In  particular  (cf.  Erdeyli  et.al.,  I,  A. 2(9)) 

n-1    ,      , . ,       f.    . n-m-1 


g-iyS   (  )  ^     y     OmilL     (111 (A.lOa) 

m=l  K. 


and 
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e-iy\(y)logy  =  -  |^^  e'^y^EldyK) 


=  -  l^lly      e  ^y\og(-iyK)+^(y)  (A. 10b) 

where  ^(y)  is  an  analytic  function  of  y  in  the  lower  half  plane.   Hence 
(A. 8)  is  equal  to 

—CO   ^       ' 

n 
Replacing  K  with  ^  a.b.,  we  obtain 


'  '  n 


^N 


0    0  j=l  -^ 

n  oo  n-1        n  N-n 

X  i   /  y   exp{-iy  I   a  b  } [G(y) ]    dy 
0  j=l  ^  ^ 


(A. 11) 


as  was  to  be  shown. 
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A. 3  Derivation  of  Assertion  2. 


Expand  the  characteristic  function  G(y)  in  a  neighborhood  of  the 

origin: 

M       iM 
G(y)  =  1-iM^Y  -  ^  y'  +  ^  y'  +  OCy").  (A. 12) 

Since  the  region  of  y  which  contains  the  leading  terms  of  the  asymptotic 

expansion  of  (A. 12)  is  y~0(p   ),  a  term  of  the  form  p  y   is  a  term  of 

order  p   .,  so  upon  substituting  (A. 12)  for  G(y)  in  [G(y) ]   and  expanding 
again,  we  obtain 

[G(y)]P  =  exp{plogG(y)} 

"^P^l^  1  1  -3 

=   e  -"    {1-  ^Vy2-ipC3Y3   +  |p2vVVo(p      )} 

where  C^  =    [-  ^M^  +  Im,M„  -  hi^].      That   is, 

1  1  JJ 

^N,n(^>   =  fc^   [("-^>'    /   •••/     da^-.-da^<5(l-  I  a  )  (A.13) 

'^  0  0  3=1  -^ 

exp{-iy[pM  +  I  a.b.]}{l-  ^Vy^ 
^  j=l  J    J 

-   ipC3y'  +|p2v2y'*+0(p"')}dy. 
Use  the  identity 

OO  £      CO 

/  exp{-iy(pa+K)}y°  dy  =  (-)  — ^   /  exp{-iy(pa+K) }y  dy 
0  P   aa'^  0 

to  compute  terms  of  (A.13)  in  terms  of  derivatives  of 


.n  r 
X    X      j      Y 

0 


n-1 
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H(pa)  =  n  [pa+b.]    =  (n-1)!  f    ...f     da,  .  .  .da  (S(l-  Y  a.) 
j  =  l     ■'  o    o  j=l 


°°   _i  " 

X   l"  /  y"   exp{-ly[pa+  I   a.b.]}dy 
0  j=i  J  .1 

Straightforward  computation  yields 

J, 
-i   9  H(pa)   ,,,      .       ,      .    _.  -n-?,, 
P   ^"^-^  =  H(pa)gj^(pa)  ~  0(p    ) 

3a 

-  1  -2 

SO  that  in  particular  pg„(pa)  -  0(p   ),   pg  (pa)  ~  0(p   )  and 


bO^r"/   ~  ^  \V  I    >    ffeo' 


-2 


p  g, (pa)  ~  0(p   ).   In  the  statement  of  Assertion  2,  we  rewrite  g„ (pa) , 
for  £  =  2,3,4  as  explicit  functions  g„ (pM^ ,Y)  of  pM  and  Y. 
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A. 4  Derivation  of  Assertion  4. 


~  k 
Write  the  expectation  of  Y   as 

n 


Mt  °°  °o        ,       n     Y.f(YJe)dY. 

o  0  j=l       J  n      ^ 


and  begin  by   rewriting  Y^  =    (Y^+Y2+Y2+. . .+Y^+S)    -    (Y2+Y2+. . .+Y^+S) .      Then 

«>        ,       n     Y.f(Y.|e)dY.         .^, 
0  0  J=2     J  n 


N. 


(N-n)!    J    ••••'  (Y,+Y^+...+Y  +S)  .^    [Y.  +  ...+Y+S]    ^  (^^ID^^ 

0  0  12  n  j=3       J  n 


In  the  second  integral  we  may  replace  Y„  in  the  numerator  by  Y  +Y„  provided 
that  we  multiply  it  by  1/2.   Then  adding  Y  +. . .+Y  +S  to  Y  +Y  and  sub- 
tracting it  from  Y  +Y  ,  this  integral  becomes 


1   XT'    °°    °°   ,   n  Y.f(Y.  e)dy.   ^„ 
i   N.    r  ,    k     _i .1  '-  ^1   f  *N-n 

'•••■'    n   /'   [Y.  +  ...+Y 
0    0     J  =  3  '  J      n 

3 


1    M.    •"    »^nS  //^y^l^^^^Z  n  Y.f(Y.|9)dY.  ,^ 

1  N.    r     r     Z^l 1    l'-   1  .*N-n,  I  . 

2  (N-n):  ^  ••••'     (Y  +Y^+.  .  .+Y  +S)    .  ,  [Y.  +  ...Y+S]       <.^\l)a^ 

0    0     1  2      n      j=4   3     n 


Hence, 

n  Y.f(Y.|e)dY 


(A. 14) 


E 


'   '   0    0     J=2  '  J      n   ' 

1    xTi     «=    «>   ,   n  Y.f(Y.|e)dY. 
o    0     J  =  3   .1      n   ^ 
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k    ' 


oo      oo 


^0^3  /,  ^^yi\^^^yi     n  Y.f(Y.|e)dY. 


^2  (N-n):  -'^•••■'^  (Y,+Y,+  ...+Y  +S)      .",  [Y.  +  ...Y+S]  ^     (S|e)dS 
0    0   1  2      n  '     3=4   j     n 


Repeating  this  decomposition  n-1  times,  we  find  that 


M  n-1            oo   oo  Y  ^"^^fCY  |e)dY   ^„  ^„ 
Tj/v  ^N     f^\      V   /  i\t''^~^\    f        f     _D n'—   n  ^*N-n+£ ,_  I -.  ,„ 

£=0  0   0       n 


Using  the  identity 


CO  -A(Y  +S) 
/  e    "   dA  =  ^ 


Y  +S  ' 
n 


a  generic   integral   in   the  sum  is 


oo     oo     oo  -X(Y   +S) 

/      /      /     Y  '^"'^   e  '^        f(Y    |e)f*^"''"^\s|e)dSdY   dA 

•'•'•'        n  n'—  '—  n 

0       0       0 


oo     (k+1)  n-n+a 

=  /     L  (A)[L(A)]  dA 

0 


so  that 


oo      (k+1)  N-n  n-1  £     _  £ 

E(Y  ^   =  nO    /     L  (A)[L(A)]  I      (-1)    ("/)  [L(A)  ]   dA 


0  £=0 


which  equals  (3.4) 
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A. 5  Computation  of  E(Y  Y   )  For  Exponential  Superpopulations 


Using  a  Feynman  parameterization,  the  marginal  expectation  of  "nearest 
neighbor"  cross  products,  E(Y  Y  ^)   may  be  expressed  as 

n(n+l)(  M/  /  AL^""(A)  L"(X)L"(aX)  [l-L(aA)  ]""^dadX. 
^"  ^  0  0 

This  representation  generalizes,  but  requires  a  (k+1)  fold  integration 

(k  Feynman  parameters)  for  evaluation  of  E(Y  Y  ,,). 

n  n+K 

The  integral  above  says  that  if  L(A)  =  (1+A)~  ,  then  E(Y  Y  J  = 

n  n+1 


4n(n+l)Ly  J(n+1,N)  with 


"     -.n        ^    n-1, 
J(n+1,N)  =  /  V;—  {  /    ^  %   UX 

The  integral  in  curly  brackets  is  equal  to  F  (n+2,n;n+l;-A)/n,   F  a 
Gaussian  hypergeometric  function.   Using  the  recursion  relation 

(n+1) [n-X]2F^(n+2,n;n+l;-X) 

=  n(n+l)(l+A)2Fj^(n+2,n,n;-A)  -A2F^(n+2,n;n+2;-A) 

=  n(n+l)(l+A)"^""^^^-A(l+A)"''  =  (l+A)"^""^^^  [n(n+l)  -  A(l+A)] 


we  have 


Thus 


=  (1+A)  ^""^^^[(n-AXn+A)  +  (n-A)  ] . 


(n+l)2Fj^(n+2,n;n+l;-A)   =   (n+l+A)  (1+A)  ^'''^^^ 


J(n+1,N)  =  -  / 


"  0 


+ ^   dA 


Ld+A)^"^^   (n+Dd+A)^"^^- 


-  B(n+l,N-n+2)  +   Z^,.  B(n+2,N-n+l) 
n  n(n+l) 


mir-  '"-"-^ 
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Hence 


as   in    (2.5) 


^Vn+1>    =   ^'^("-^^^(n+l)    -^("-^1'^)    =   ^[l-]^ni-^] 
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